Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDAT Migration: Test lat_lon set with run script and debug any issues #794

Merged
merged 16 commits into from
Mar 12, 2024

Conversation

tomvothecoder
Copy link
Collaborator

@tomvothecoder tomvothecoder commented Feb 26, 2024

Issues fixed

  • ValueError: Multiple 'Z' axis dims were found in this dataset, ['ilev', 'lev']. Please drop the unused dimension(s) before performing grid operations. -- affects _align_grids_to_lower_res() and _apply_land_sea_mask()
    • Root Cause: Some datasets have ilev coordinates with pressure variables (e.g., hyam, hybm) that are not used
    • Solution: Add _get_grid() and _drop_unused_ilev_axis()
  • KeyError: No attribute "units" -- affected cosp_bin_sum()
    • Root Cause: In cosp_bin_sum(), calling .sum() results in Xarray dropping the original units attribute. The original units are required for the "convert_units()`" function to correctly convert units to "%". Otherwise, units are None and never set to %.
    • Solution: Add keep_attrs=True to .sum() call
  • OSError: No file found for 'HadISST' and 'JJA' in /global/cfs/cdirs/e3sm/diagnostics/observations/Atm/climatologyand OSError: No file found for 'HadISST' and 'ANN' in /global/cfs/cdirs/e3sm/diagnostics/observations/Atm/climatology
    • Root cause: Trying to open a non-existent ref dataset to get the global attr to set on the parameter object.
    • Solution: Use a try and except in Dataset._get_global_attr_from_climo_dataset() to cover cases where ref files are missing and the test dataset is being used as the ref dataset
  • x and y nan location mismatching -- affects CLDTOT_TAU1.3_9.4_ISCCP, CLDTOT_TAU1.3_ISCCP, CLDTOT_TAU9.4_ISCCP, CLDTOT_TAU9.4_ISCCP, CLDLOW_TAU1.3_9.4_MISR, CLDLOW_TAU1.3_MISR, CLDLOW_TAU9.4_MISR, CLDTOT_TAU1.3_9.4_MISR, CLDTOT_TAU1.3_MISR, CLDTOT_TAU9.4_MISR, CLDHGH_TAU1.3_9.4_MODIS, CLDHGH_TAU1.3_MODIS, CLDHGH_TAU9.4_MODIS, CLDTOT_TAU1.3_9.4_MODIS, CLDTOT_TAU1.3_MODIS, CLDTOT_TAU9.4_MODIS
  • PminusE - large relative differences
    • Root cause: Attributes are not preserved due to lambda arithmetic with xarray objects
    • Solution: Define individual formula functions (pminuse_1(), pminuse_2(), pminuse_3()) and move arithmetic into those functions wrapped by with xr.set_options(keep_attrs=True):
  • QREFHT - large relative difference and shape mismatch (180, 360), (721, 1440)
    • Root cause: Missing derivation definitions in derivations.py from acme.py on main (related PR).
    • Solution: Add derivation definitions from acme.py
  • ALBEDOC (reference dataset) - x and y nan location mismatching
    • Root Cause: The albedoc() formula function divides rsutcs / rsdt. rsdt contains 0s, which results in divide by zeros. Xarray and NumPy replaces these values with np.inf instead of throwing a "Divide By Zero" error. Meanwhile CDAT replaces these values with the floats from rsutcs, but they are masked so the output will np.nan.
    • Solution: Add _replace_inf_with_nan() to update the array values for this case
  • CRU-TREFHT -- land mask is not being applied
    • Root cause: The conditional in subset_and_align_datasets() was incorrect and did not consider "land_60S90N"
    • Solution: Update the conditional from if region == "land" or region == "ocean": to if "land" in region or "ocean" in region:"
  • CRU-TREFHT -- after applying land mask, region subsetting is not being applied
    • Root cause: The conditional in subset_and_align_datasets() was incorrect and ignored subsetting
    • Solution: Update the conditional from elif region == "global": to if "global" not in region:

Differences we can accept

There seems to be a minor difference in nan locations for the regridded land-sea mask produced using xCDAT + xESMF vs. CDAT + ESMF (method="bilinear"). xCDAT has a little bit more nan values than CDAT. This causes x and y nan location mismatch when comparing array values. I don't know the exact reason why there are differences, but they seem insignificant enough to accept.

The absolute sum, mean, and nan count values are similar across variables. The diff plots also show no visible differences. All of the scripts for debugging these variables can be found here.

  • TAUXY (ocean)
xCDAT CDAT Diff
output output_2 diff
  • ERA5-TREFHT (land)
xCDAT CDAT Diff
debug_era5_trefht_actual debug_era5_trefht_expected debug_era5_trefht_diff
  • MERRA2-TREFHT (land)
xCDAT CDAT Diff
debug_merra2_trefht_actual debug_merra2_trefht_expected debug_merra2_trefht_diff

Issues on main

  • Some reference variable names don't align with main - affects
    AODVIS, FLUT, FLUTC, LCWF, RESTOM, SOLIN, SWCF, FLDS, FLDSC, FSDS, FSDSC, FSNS, FSNSC, LHFLX, SHFLX, T, U, U10, PRECT, SST

    • Root Cause: main saves the original variable name, while cdat-migration-fy24 saves the target (derived) variable name. cdat-migration-fy24 is doing the right thing here.
    • Solution: Use the derived variables dictionary in the template notebook to align variable keys before comparing.
  • The following variables have large diffs due to a bug on main: MISRCOSP-CLDLOW_TAU1.3_9.4_MISR, MISRCOSP-CLDLOW_TAU1.3_MISR, MISRCOSP-CLDLOW_TAU9.4_MISR. I believe this branch is doing the correct thing. We can revisit the comparison for these variables once the issue below is fixed on main.

Checklist

If applicable:

  • New and existing unit tests pass with my changes (locally and CI/CD build)
  • I have added tests that prove my fix is effective or that my feature works
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

@tomvothecoder tomvothecoder changed the title CDAT Migration Phase 2: Refactor core utilities and lat_lon set (#677) CDAT Migration: Debug lat_lon set with run script Feb 26, 2024
@tomvothecoder tomvothecoder self-assigned this Feb 26, 2024
@tomvothecoder tomvothecoder added the cdat-migration-fy24 CDAT Migration FY24 Task label Feb 26, 2024
- Fix multiple Z axis dims by dropping ilev with `_drop_unused_ilev_axis()`
- Fix `cosp_bin_sum()` `.sum()` call not preserving units for unit conversion
- Fix unused ilev axis not being dropped in `_apply_land_sea_mask()`
@tomvothecoder tomvothecoder changed the title CDAT Migration: Debug lat_lon set with run script CDAT Migration: Test lat_lon set with run script and debug any issues Mar 4, 2024
@tomvothecoder
Copy link
Collaborator Author

tomvothecoder commented Mar 12, 2024

Hi @chengzhuzhang, I'm just tagging you so you're aware of the issues that were fixed in this PR, differences that we can accept, and issues found on the main branch. I will merge this PR to continue moving forward.

With the list of issues fixed, the cdat-migration-fy24 branch will be even more robust.

@chengzhuzhang
Copy link
Contributor

Hi @chengzhuzhang, I'm just tagging you so you're aware of the issues that were fixed in this PR, differences that we can accept, and issues found on the main branch. I will merge this PR to continue moving forward.

With the list of issues fixed, the cdat-migration-fy24 branch will be even more robust.

Got it. Thank you for working on this!

@tomvothecoder tomvothecoder merged commit dd4694c into cdat-migration-fy24 Mar 12, 2024
4 checks passed
@tomvothecoder tomvothecoder deleted the refactor/792-lat-lon-debug branch March 12, 2024 21:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment